We discuss various ensembles of homogeneous complex networks and a Monte-Carlo method of generating graphs from these ensembles. The method is quite general and can be applied to simulate micro-canonical, canonical or grand-canonical ensembles for systems with various statistical weights. It can be used to construct homogeneous networks with desired properties, or to construct a non-trivial scoring function for problems of advanced motif searching.



I. INTRODUCTION 



Complex networks are a newly emerging branch of random graph theory. For a long time random graphs were studied mainly in pure mathematics, but recently, owing to the availability of empirical data on real-world networks, they have attracted the attention of physics and the natural sciences. Methods of statistical physics, both empirical and theoretical, have thus begun to play an important role in this research area.

Empirical observations of real networks have fed back into theoretical development, which now concentrates on understanding the observed features, for example fat tails in the node degree distribution, the small-world effect, degree-degree correlations, or high clustering. Two complementary approaches have been developed: the diachronic one, known as growing networks, and the synchronic one, which is a sort of statistical mechanics of networks.

We will discuss here the latter. This approach is a natural extension of the ideas of Erdos and Renyi [10]. It is well suited both for growing (causal) networks, for which nodes' labels reflect the causal order of nodes' attachment to the network [12], and for homogeneous networks, for which nodes' labels can be permuted freely in an arbitrary way. Here we shall discuss mainly homogeneous networks. We shall briefly comment on causal networks towards the end of the paper.

The main aim of the paper is to present a consistent picture of statistical mechanics of networks. Some ideas have 
already been introduced earlier. They are scattered in many papers and discussed in many different contexts. We put 
them together, add some new material and introduce a guideline to obtain a self-contained introduction to statistical 
mechanics of complex networks. 

The basic concept in the statistical formulation is the statistical ensemble. A statistical ensemble of networks is defined by ascribing a statistical weight to every graph in a given set. Physical quantities are measured as weighted averages over all graphs in the ensemble. The probability of the occurrence of a graph in random sampling is proportional to its statistical weight. If the statistical weight changes, the probability of occurrence of randomly sampled graphs also changes, and in effect different random graphs will be observed. The concept of statistical weight is crucial, since it defines randomness in the system. The statistical weight is built out of two ingredients: the configuration space weight and the functional weight. The configuration space weight is proportional to the uniform probability measure on the configuration space, which tells us how to choose graphs uniformly in the configuration space. To illustrate the meaning of the uniform measure consider an ensemble of Erdos-Renyi graphs with N nodes and L links [10, 11]. The configuration space consists of $\binom{\binom{N}{2}}{L}$ graphs with labeled nodes. All those graphs are equiprobable, and therefore the configuration space weight is the same for each graph. It is convenient to choose this weight to be 1/N! since then it can be interpreted as a factor which takes care of the N! possible permutations of nodes' labels. This factor has the same origin as the corresponding factor in quantum mechanics for indistinguishable particles and it is constant for all graphs in a finite N-ensemble.

We can calculate the entropy of random graphs as

$$ S = \ln \left[ \frac{1}{N!} \binom{\binom{N}{2}}{L} \right]. \tag{1} $$

In the limit of large sparse graphs, $N \to \infty$ and $2L/N = \alpha = \mathrm{const} > 2$, the entropy is an over-extensive function of the system size:

$$ S = \frac{\alpha - 2}{2}\, N \ln N + \dots, \tag{2} $$

unlike in standard thermodynamics.
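The over-extensive growth of the entropy can be checked numerically. The sketch below is only an illustration: it assumes the entropy is the logarithm of the number of labeled graphs divided by N!, and the function names `er_entropy` and `leading_term` are hypothetical helpers, not part of the original text. It compares the exact value with the leading term of Eq. (2) for $\alpha = 4$.

```python
import math


def er_entropy(n_nodes: int, n_links: int) -> float:
    """Exact entropy ln[(1/N!) * C(C(N,2), L)], evaluated with log-gamma functions."""
    n_pairs = n_nodes * (n_nodes - 1) // 2
    log_binomial = (math.lgamma(n_pairs + 1)
                    - math.lgamma(n_links + 1)
                    - math.lgamma(n_pairs - n_links + 1))
    return log_binomial - math.lgamma(n_nodes + 1)


def leading_term(n_nodes: int, alpha: float) -> float:
    """Leading over-extensive term (alpha - 2)/2 * N ln N of Eq. (2)."""
    return 0.5 * (alpha - 2.0) * n_nodes * math.log(n_nodes)


alpha = 4.0  # mean degree 2L/N, chosen here only as an example
ratios = []
for n_nodes in (10**3, 10**4, 10**5, 10**6):
    n_links = int(alpha * n_nodes / 2)
    ratios.append(er_entropy(n_nodes, n_links) / leading_term(n_nodes, alpha))

for ratio in ratios:
    print(f"S / leading term = {ratio:.4f}")
```

The ratio approaches one only slowly, because the first correction to Eq. (2) is linear in N and is therefore suppressed merely by a factor of order 1/ln N.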

Let us move to weighted graphs. The idea is to modify the Erdos-Renyi ensemble by introducing a functional weight which explicitly depends on the graph's topology. For example, if we choose the functional weight to be a function of the number of loops on the graph, we can suppress or favor loops in typical graphs of the ensemble. In a similar way we can choose statistical weights to control the node degree distribution, to produce homogeneous scale-free graphs, or to introduce correlations between degrees of neighboring nodes.

Classical thermodynamics describes systems in equilibrium for which the functional weight is given by the Gibbs measure, $\sim \exp(-\beta E)$, where E is the energy of the system. When discussing complex networks it is convenient to abandon the concept of energy and Gibbs measure and consider a more general form of statistical weights because many networks are not in equilibrium. Indeed, many networks emerging as a result of a dynamical process like growth are far from equilibrium [1, 2, 3]. This does not mean, though, that one cannot introduce a statistical ensemble of growing networks. On the contrary, one can for example consider an ensemble of networks which result from many independent repetitions of the growth process terminated when the network reaches a certain size. Such a collection of networks does not describe a thermodynamic equilibrium. The functional weight can be deduced from the parameters of the growth process but of course it has nothing to do with the Gibbs measure.

In fact, many real-world networks result from a combination of a growth process and some thermalization processes. 
For example, the Internet grows but at the same time it continuously rearranges. The latter process introduces a sort 
of thermalization. Today the growth has probably still larger influence on the topology of the underlying network but 
in the future the growth may slow down due to saturation and then equilibration processes resulting from continuous 
rewirings will take over. Similarly all evolutionary networks emerge from a growth mixed with a sort of thermalization 
related to the continuous network rearrangement. Therefore it is convenient to have a formalism which can extrapolate 
between the two regimes in a flexible way. The approach which we propose here is capable of modeling functional 
properties of networks by choosing an appropriate functional weight. 

Let us return to the configuration space weight. As we mentioned this weight is equivalent to the uniform probability 
measure on the configuration space for which all graphs are equiprobable. It is a very crucial part of the construction 
of the ensemble to carefully specify what one means by equiprobable graphs. Consider first graphs with N nodes. 
There are at least two natural candidates for the uniform measure in such a set of graphs. Since one is interested in 
shape (topology) of graphs one can define all shapes to be equiprobable. Alternatively one can introduce labels for 
nodes of each graph to obtain a set of labeled graphs and then one can define all labeled graphs to be equiprobable. 
The two definitions give two different probability measures since the number of ways in which one can label graph 
nodes depends on graph's topology and thus the probability of occurrence of a given graph will depend on its topology 
too. It turns out that the latter definition is more natural. As we have seen above this definition leads to Erdos-Renyi 
graphs. So we stick to this definition and from here on we shall ascribe to each labeled graph the configuration space weight 1/N!, which is constant in the set of graphs of size N.

The situation is more complex if one considers pseudographs that is graphs which have multiple connections (more 
than one link between two nodes) or self-connections (a link having the same node at its endpoints). In this case one 
can also label links and ascribe the same statistical weight to each fully labeled graph. For this choice the statistical 
weight of each graph is equal to the symmetry factor of Feynman diagrams generated in the Gaussian perturbation field theory.

The paper is organized as follows. In the next section we will recall some basic definitions. Then we will discuss 
Erdos-Renyi graphs in the context of constructing statistical ensemble and later we will generalize the construction to 
weighted homogeneous graphs. After this we will describe Monte-Carlo algorithms to generate graphs for canonical, 
grand-canonical and micro-canonical ensembles and discuss their representation in terms of adjacency matrices. A 
section will be devoted to pseudographs. In the last section we will briefly summarize the paper.



II. DEFINITIONS 



Let us first introduce some terminology. A graph is a set of N nodes (vertices) connected by L edges (links). A graph need not be connected. It may have many disconnected components including empty nodes (without any link). If a graph has no multiple or self-connected links we shall call it a simple graph or just a graph. An example is illustrated in Fig. 1. Later we shall also discuss graphs with multiple- and self-connections. To distinguish them from simple graphs we shall call them degenerate graphs or pseudographs. One can consider directed or undirected graphs. Directed graphs are built of directed links while undirected graphs are built of undirected ones. In this paper we shall discuss undirected graphs but the discussion can easily be generalized to directed ones as well. Sometimes we will find it convenient to represent an undirected link as two oriented links going in opposite directions.

FIG. 1: An example of a simple graph with N = 6, L = 5. Vertices without links (like no. 2) are allowed. Each vertex can have at most N - 1 links. Positions of vertices in the picture are meaningless. The only information which matters is connectivity.

A simple graph can be represented by its adjacency matrix, which is an $N \times N$ matrix whose entries $A_{ij}$ are equal to one if there is a link between vertices i, j and zero otherwise. Since self-connections are forbidden we have $A_{ii} = 0$ on the diagonal. The adjacency matrix of an undirected graph is also symmetric because if there is a link $i \to j$ ($A_{ij} = 1$), there must also be the opposite one $j \to i$ ($A_{ji} = 1$).
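The properties of the adjacency matrix listed above can be made concrete with a short sketch. The link list below is a hypothetical example with N = 6 and L = 5 and one empty vertex; it is not meant to reproduce the graph drawn in Fig. 1, and the helper names are illustrative only. Vertices are numbered from zero.

```python
def adjacency_matrix(n_nodes, links):
    """Build the adjacency matrix of a simple undirected graph on nodes 0..N-1."""
    matrix = [[0] * n_nodes for _ in range(n_nodes)]
    for i, j in links:
        if i == j:
            raise ValueError("self-connections are not allowed in a simple graph")
        if matrix[i][j]:
            raise ValueError("multiple connections are not allowed in a simple graph")
        # An undirected link is stored as two oriented links i -> j and j -> i.
        matrix[i][j] = matrix[j][i] = 1
    return matrix


def degrees(matrix):
    """Node degrees are the row sums of the adjacency matrix."""
    return [sum(row) for row in matrix]


# Hypothetical example: N = 6, L = 5, node 1 has no links.
links = [(0, 2), (0, 3), (2, 3), (3, 4), (4, 5)]
A = adjacency_matrix(6, links)
print(degrees(A))
```

The sum of all degrees equals 2L, since every undirected link contributes to two rows of the matrix.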

In this paper we want to construct statistical ensembles of homogeneous graphs having desired properties. We discuss three types of ensembles: an ensemble of graphs with a fixed number of nodes N and a varying number of links, an ensemble with a fixed number of nodes N and a fixed number of links L, and finally an ensemble of graphs with a given node degree sequence $\{q_1, q_2, \dots, q_N\}$, which we shall call grand-canonical, canonical and micro-canonical ensembles, respectively. There are of course many other possibilities, like for instance ensembles with a varying number of nodes, or with a fixed number of loops etc., but the three mentioned above are encountered most frequently. To construct a statistical ensemble for the chosen set of graphs, we have to specify the statistical weight for each graph in the considered set.

In the next section using the Erdos-Renyi graphs and binomial graphs we will deduce a general logical structure 
standing behind the construction of ensembles of homogeneous graphs and then we will use this structure to introduce 
ensembles with an arbitrary functional weight which explicitly depends on the node degrees. 

III. STATISTICAL ENSEMBLE FOR ERDOS-RENYI RANDOM GRAPHS 

For simplicity, we start from the well-known model of Erdos-Renyi graphs [10, 11]. In this classical model one considers simple graphs with N labeled nodes and L links chosen at random out of the $\binom{N}{2}$ possible ones. All choices are equiprobable and so are the corresponding graphs, understood as graphs whose vertices are labeled. Usually one is interested in unlabeled graphs, that is in their shape or topology, and not in their labeled version.

FIG. 2: Top: the graph on the left can be realized as three different labeled graphs. A is equivalent to B, C to D and E to F. They are equivalent because they have the same adjacency matrix. Bottom: triangle-shaped graph has only one realization as labeled graph.

FIG. 3: Three possible graphs for N = 4, L = 3. The number of ways of labeling these graphs is: $n_A = 12$, $n_B = 4$, $n_C = 4$.

To explain what is meant by a graph's shape or topology, let us consider the simple graph shown in the upper part of Fig. 2. The unlabeled graph (topology) on the left hand side of the figure is represented by labeled graphs on the right hand side. There are six possible realizations, but only three of them, A, C, E, are distinct. B is the same as A since it can be obtained from A by a continuous deformation: one can continuously move the vertex 2, together with the link attached to it, to the position of the vertex 3, and at the same time the vertex 3 to the position of the vertex 2. Such a continuous deformation does not change the graph's connectivity. The same holds for the pairs C, D and E, F. This can also be seen if we take into account the adjacency matrix A. Both A and B have identical adjacency matrices, which are different from those for C, D and E, F:

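$$ A_A = A_B = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \quad A_C = A_D = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \quad A_E = A_F = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}. \tag{3} $$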

Now we remove labels from all graphs in Fig. 2 to obtain the two unlabeled graphs depicted on the left hand side. Although there are three distinct adjacency matrices for the upper shape, all of them lead to the same connections between vertices (the same unlabeled graph). The graph in the lower line of Fig. 2 can be labeled in only one way [33], which is represented by the following adjacency matrix:

$$ A = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}. \tag{4} $$

Thus the triangular shape has only one realization as labeled graph. Furthermore, the upper and lower graphs are 
obviously distinct because none of the corresponding labeled graphs (adjacency matrices) representing the upper graph 
is equal to that for the lower graph. In this trivial case the difference is in the number of links. More generally, any 
two graphs are distinct if the underlying labeled graphs (adjacency matrices) cannot be converted one into another by a permutation of nodes' labels. It is clear that for the graphs in Fig. 2 there is no such permutation, but in the general case the comparison of graphs may be a complex problem.

Let us now apply the ideas sketched above to define an ensemble of graphs. As an example we shall consider Erdos-Renyi graphs with N = 4, L = 3. The ensemble consists of three distinct graphs A, B, C shown in Fig. 3. Now we want to determine the statistical weights for those graphs. The adjacency matrices of the underlying labeled graphs are essentially different for A, B, C since they cannot be converted one into another by a permutation of nodes' labels. Each graph in Fig. 3 has a few possible realizations as a labeled graph. One can label four vertices in 4! = 24 ways corresponding to permutations of 1-2-3-4. For the graph A, twelve of them give distinct labeled graphs. For example, the permutation 1-2-3-4 gives the same labeled graph (adjacency matrix) as 4-3-2-1. The same kind of symmetry applies to the remaining pairs of permutations. Therefore there are $n_A = 12$ labeled graphs for A. Similarly one can find that there are $n_B = 4$ labeled graphs for B and $n_C = 4$ for C. Altogether, there are $n_A + n_B + n_C = 20$ labeled graphs, in accordance with $n = \binom{\binom{4}{2}}{3} = \binom{6}{3} = 20$. In the Erdos-Renyi ensemble labeled graphs are equiprobable, so the shapes A, B, C have the following probabilities:

$$ p_A = \frac{n_A}{n} = \frac{3}{5}, \quad p_B = \frac{n_B}{n} = \frac{1}{5}, \quad p_C = \frac{n_C}{n} = \frac{1}{5}. \tag{5} $$

These probabilities give frequencies of the occurrence of the shapes A, B, C in random sampling. We see that graphs 
(unlabeled graphs) are not equiprobable in Erdos-Renyi's ensembles. 
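The counting for N = 4, L = 3 can be reproduced by brute force. The following sketch enumerates all labeled graphs and groups them by shape; the canonical form used here (the smallest relabeled link list over all permutations of the labels) is an illustrative choice that is only practical for very small N, and the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, permutations


def canonical_form(links, n_nodes):
    """Smallest relabeled link list over all permutations of node labels.

    Two labeled graphs have the same shape (topology) exactly when
    their canonical forms coincide.
    """
    best = None
    for perm in permutations(range(n_nodes)):
        relabeled = tuple(sorted(tuple(sorted((perm[i], perm[j]))) for i, j in links))
        if best is None or relabeled < best:
            best = relabeled
    return best


def labeled_graph_counts(n_nodes, n_links):
    """Map each shape to the number n(alpha) of its distinct labeled graphs."""
    pairs = list(combinations(range(n_nodes), 2))
    return Counter(canonical_form(links, n_nodes)
                   for links in combinations(pairs, n_links))


counts = labeled_graph_counts(4, 3)
total = sum(counts.values())
probabilities = sorted(Fraction(c, total) for c in counts.values())
print(sorted(counts.values()), total, probabilities)
```

The three shapes are found with 12, 4 and 4 labeled realizations, so sampling labeled graphs uniformly yields the shape probabilities 3/5, 1/5, 1/5.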

Let us denote the statistical weights for A, B, C by $w_A, w_B, w_C$, which are proportional to the probabilities of configurations in the ensemble. In our case $w_A : w_B : w_C = p_A : p_B : p_C$. There is a common proportionality constant in the weights. It is convenient to choose this constant in such a way that the weight of each labeled graph is 1/N! [33]. For this choice we have

$$ w_A = 1/2, \quad w_B = 1/6, \quad w_C = 1/6, \tag{6} $$

for the graphs in Fig. 3. This choice compensates for the increasing factor of permutations N!, when one considers ensembles with varying N, and intuitively removes overcounting coming from summing over permutations of indistinguishable nodes' labels. However, one should remember that in general the number of distinct labeled graphs of a graph is less than N! and therefore the weight of a graph is smaller than 1. The larger the symmetry of a graph topology, the smaller the number of underlying labeled graphs and thus the smaller the statistical weight (see for
instance Fig. 

The partition function Z(N,L) for the Erdos-Renyi ensemble can be written in the form:

$$ Z(N,L) = \sum_{\alpha' \in lg(N,L)} \frac{1}{N!} = \sum_{\alpha \in g(N,L)} w(\alpha), \tag{7} $$

where lg(N,L) is the set of all labeled graphs with given N, L and g(N,L) is the corresponding set of (unlabeled) graphs. The weight is $w(\alpha) = n(\alpha)/N!$, where $n(\alpha)$ is the number of labeled graphs of graph $\alpha$. We are interested in quantities averaged over the ensemble. More precisely, we are interested in quantities which depend on the topology of a graph and not on nodes' labels. This means that if $O(\alpha)$ is such an observable then for any two labeled graphs $\alpha'_1$ and $\alpha'_2$ of graph $\alpha$ we have $O(\alpha'_1) = O(\alpha'_2) = O(\alpha)$. The average is defined by

$$ \langle O \rangle = \frac{1}{Z(N,L)} \sum_{\alpha' \in lg(N,L)} O(\alpha') \frac{1}{N!} = \frac{1}{Z(N,L)} \sum_{\alpha \in g(N,L)} w(\alpha)\, O(\alpha). \tag{8} $$

We shall refer to an ensemble with fixed N,L as to a canonical ensemble. The word "canonical" is used here to 
emphasize that the number of links L is conserved like the total number of particles in a container with ideal gas 
remaining in thermal balance with a source of heat. The partition function Z(N,L) can be calculated by pure combinatorics as we have seen in the introduction. Now for completeness we derive it using the adjacency matrix representation of graphs. The adjacency matrices A are symmetric; they have zeros on the diagonal and L unities above the diagonal. Thus we have

$$ Z(N,L) = \sum_{A_{12}} \sum_{A_{13}} \cdots \sum_{A_{1N}} \sum_{A_{23}} \sum_{A_{24}} \cdots \sum_{A_{N-1,N}} \delta\Big[ L - \sum_{p<r} A_{pr} \Big] \frac{1}{N!}, \tag{9} $$

where $\delta[x] = 1$ if $x = 0$ and zero otherwise. The sums run over $A_{ij} = 0, 1$ for all pairs $1 \le i < j \le N$. Using an integral representation of $\delta[x]$ and exchanging the order of summation and integration we obtain the expected result:

$$ Z(N,L) = \frac{1}{N!} \frac{1}{2\pi} \int_{-\pi}^{\pi} dk\, e^{ikL} \left( 1 + e^{-ik} \right)^{\binom{N}{2}} = \frac{1}{N!} \frac{1}{2\pi} \int_{-\pi}^{\pi} dk\, e^{ikL} \sum_{m=0}^{\binom{N}{2}} \binom{\binom{N}{2}}{m} e^{-ikm} = \frac{1}{N!} \binom{\binom{N}{2}}{L}. \tag{10} $$
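For small N the sum over adjacency matrices can be carried out explicitly and compared with the closed form. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and `z_fourier` replaces the integral over k by a uniform grid of $\binom{N}{2} + 1$ points, a discretization chosen here because it reproduces the constraint exactly for integer L.

```python
import cmath
from fractions import Fraction
from itertools import product
from math import comb, factorial, pi


def z_bruteforce(n_nodes, n_links):
    """Sum over all entries A_ij = 0, 1 above the diagonal with the constraint on L."""
    n_pairs = n_nodes * (n_nodes - 1) // 2
    n_matrices = sum(1 for entries in product((0, 1), repeat=n_pairs)
                     if sum(entries) == n_links)
    return Fraction(n_matrices, factorial(n_nodes))


def z_closed_form(n_nodes, n_links):
    """Closed form (1/N!) * C(C(N,2), L)."""
    n_pairs = n_nodes * (n_nodes - 1) // 2
    return Fraction(comb(n_pairs, n_links), factorial(n_nodes))


def z_fourier(n_nodes, n_links):
    """Discretized Fourier representation of the constraint (illustrative choice)."""
    n_pairs = n_nodes * (n_nodes - 1) // 2
    n_grid = n_pairs + 1
    total = 0j
    for j in range(n_grid):
        k = 2 * pi * j / n_grid
        total += cmath.exp(1j * k * n_links) * (1 + cmath.exp(-1j * k)) ** n_pairs
    return (total / n_grid).real / factorial(n_nodes)


print(z_bruteforce(4, 3), z_closed_form(4, 3), z_fourier(4, 3))
```

The brute-force sum grows as $2^{\binom{N}{2}}$ and is therefore useful only as a consistency check for a handful of nodes.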

This method can be applied to calculate averages of various quantities. As an example consider the node degree distribution $\pi(q)$, which tells us the probability that a randomly chosen vertex of a random graph has degree q:

$$ \pi(q) = \left\langle \frac{1}{N} \sum_{i=1}^{N} \delta[q - q_i] \right\rangle, \tag{11} $$

where $q_i$ denotes the degree of vertex i. By random graph we mean that we average over graphs from the given ensemble. We know that for Erdos-Renyi graphs $\pi(q)$ is a Poissonian distribution in the limit $N \to \infty$:

$$ \pi(q) = \frac{\bar{q}^{\,q}}{q!} \exp(-\bar{q}), \tag{12} $$

where $\bar{q} = 2L/N$ is the average vertex degree. This result can be rederived using the method described above. Let us look at the degree of the vertex labeled by one. The result does not depend on the vertex label for homogeneous graphs since labels have no physical meaning. One can find that

$$ \pi(q) = \frac{1}{Z(N,L)} \sum_{A_{12}} \sum_{A_{13}} \cdots \sum_{A_{N-1,N}} \frac{1}{N!}\, \delta\Big[ L - \sum_{p<r} A_{pr} \Big]\, \delta\Big[ q - \sum_{r=2}^{N} A_{1r} \Big], \tag{13} $$

which in the limit $\bar{q} = \mathrm{const}$, $N \to \infty$ gives Eq. (12) as expected.
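The approach to the Poissonian can be illustrated numerically. The closed form used below is not written in the text; it is an assumption of this sketch, obtained by direct counting: q of the N - 1 possible links of vertex one are present, and the remaining L - q links are placed among the other pairs, which gives a hypergeometric distribution. The function names are hypothetical.

```python
import math


def log_binomial(n, k):
    """Logarithm of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def canonical_degree_distribution(n_nodes, n_links, q):
    """Probability that a given vertex has degree q at fixed N and L (counting argument)."""
    n_pairs = n_nodes * (n_nodes - 1) // 2
    own_pairs = n_nodes - 1  # pairs that contain the chosen vertex
    if q > own_pairs or q > n_links or n_links - q > n_pairs - own_pairs:
        return 0.0
    log_p = (log_binomial(own_pairs, q)
             + log_binomial(n_pairs - own_pairs, n_links - q)
             - log_binomial(n_pairs, n_links))
    return math.exp(log_p)


def poisson(mean, q):
    """Poissonian of Eq. (12) with the given mean degree."""
    return math.exp(q * math.log(mean) - mean - math.lgamma(q + 1))


n_nodes, n_links = 2000, 4000  # mean degree 2L/N = 4, chosen as an example
mean_degree = 2 * n_links / n_nodes
exact = [canonical_degree_distribution(n_nodes, n_links, q) for q in range(31)]
limit = [poisson(mean_degree, q) for q in range(31)]
max_difference = max(abs(a - b) for a, b in zip(exact, limit))
print(f"largest deviation from the Poissonian: {max_difference:.2e}")
```

For these values the deviation is of the order of 1/N, in line with the statement that Eq. (12) holds in the large N limit at fixed mean degree.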

So far we have discussed the canonical ensemble of Erdos-Renyi graphs with L fixed. Erdos and Renyi also introduced a related model called the binomial model, where the number of nodes N is fixed but the number of links L is not fixed a priori. One starts from N empty vertices and connects every pair of vertices with probability p. In this statistical ensemble the probability of obtaining a given labeled graph with L links is $P(L) = p^{L} (1-p)^{\binom{N}{2} - L}$. Thus the partition function is

$$ \begin{aligned} Z(N,\mu) &= \sum_L \sum_{\alpha' \in lg(N,L)} \frac{1}{N!}\, p^{L(\alpha')} (1-p)^{\binom{N}{2} - L(\alpha')} = (1-p)^{\binom{N}{2}} \sum_L \left( \frac{p}{1-p} \right)^{L} \sum_{\alpha' \in lg(N,L)} \frac{1}{N!} \\ &\propto \sum_L \exp(-\mu L)\, Z(N,L) = \sum_L \exp\big( -\mu L + S(N,L) \big), \end{aligned} \tag{14} $$

where $p/(1-p) = \exp(-\mu)$ or $\mu = \ln\frac{1-p}{p}$, and the entropy S(N,L) is given by Eq. (1). We skipped an L-independent factor in front of the sum in the second line, replacing the equality by a proportionality sign. The weight of a labeled graph is $w(\alpha') = (1/N!) \exp(-\mu L(\alpha'))$, where $\mu$ is a constant which can be interpreted as a chemical potential for links in the grand-canonical ensemble (14). One can calculate the average number of links or its variance as derivatives of the grand-canonical partition function with respect to $\mu$: $\langle L \rangle = -\partial_\mu \ln Z(N,\mu)$ and $\langle L^2 \rangle - \langle L \rangle^2 = \partial^2_\mu \ln Z(N,\mu)$. The sum over states can be done exactly:

$$ Z(N,\mu) = \sum_{L=0}^{\binom{N}{2}} e^{-\mu L}\, \frac{1}{N!} \binom{\binom{N}{2}}{L} = \frac{1}{N!} \left( 1 + e^{-\mu} \right)^{\binom{N}{2}}. \tag{15} $$

It is easy to see that for fixed finite $\mu$ the average number of links behaves as $N^2$, or more precisely as

$$ \langle L \rangle = p \binom{N}{2} = \frac{1}{1 + e^{\mu}}\, \frac{N(N-1)}{2}. \tag{16} $$

Thus for $N \to \infty$ the graphs become dense. The mean value of the node degree $\langle q \rangle = 2\langle L \rangle / N$ increases to infinity. The situation changes when $\mu$ goes to infinity with increasing N. This happens in particular if the probability p scales as $p \sim 1/N$, since then $\mu$ behaves as $\mu \sim \ln N$. In this case $\langle L \rangle$ is proportional to N (16) and both terms, $\mu L$ and S, in the exponent of Eq. (14) behave as $N \ln N$ and compensate each other. The corresponding graphs become sparse and the mean node degree $\langle q \rangle = 2\langle L \rangle / N$ is now finite. The situation in which $\mu$ scales as $\ln N$ is very different from the situation known from classical statistical physics, where quantities like the chemical potential $\mu$ are intensive and do not depend on the system size N in the thermodynamic limit $N \to \infty$.
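The relation between p, the chemical potential and the average number of links can be checked with a few lines of code. This is a minimal sketch with hypothetical function names; the derivative of ln Z is approximated by a central finite difference, and the choice p = 4/(N - 1) is only an example of the sparse scaling $p \sim 1/N$.

```python
import math


def chemical_potential(p):
    """mu = ln((1 - p) / p) for link probability p."""
    return math.log((1.0 - p) / p)


def log_partition_function(n_nodes, mu):
    """ln Z(N, mu) = C(N,2) * ln(1 + exp(-mu)) - ln N!."""
    n_pairs = n_nodes * (n_nodes - 1) // 2
    return n_pairs * math.log1p(math.exp(-mu)) - math.lgamma(n_nodes + 1)


def mean_links(n_nodes, mu):
    """<L> = C(N,2) / (1 + exp(mu))."""
    n_pairs = n_nodes * (n_nodes - 1) // 2
    return n_pairs / (1.0 + math.exp(mu))


n_nodes = 1000
p = 4.0 / (n_nodes - 1)  # sparse regime: mean degree 4
mu = chemical_potential(p)

# <L> = -d ln Z / d mu, approximated by a central finite difference.
h = 1e-4
derivative = -(log_partition_function(n_nodes, mu + h)
               - log_partition_function(n_nodes, mu - h)) / (2 * h)

print(f"mu = {mu:.4f}, ln N = {math.log(n_nodes):.4f}")
print(f"<L> from formula = {mean_links(n_nodes, mu):.3f}, from derivative = {derivative:.3f}")
```

With this scaling of p the chemical potential differs from ln N only by a constant, here approximately -ln 4, while the mean number of links stays proportional to N.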

The difference between canonical and grand-canonical ensembles gradually disappears in the large N limit. For the canonical ensemble of sparse graphs the node degree $\bar{q} = 2L/N = \alpha$ is kept constant when $N \to \infty$, while in the grand-canonical one it may fluctuate around the average $\langle q \rangle = 2\langle L \rangle/N = \alpha$. However, the magnitude of fluctuations around the average disappears in the large N limit since

$$ \langle L^2 \rangle - \langle L \rangle^2 = \partial^2_\mu \ln Z(N,\mu) = \frac{e^{\mu}}{(1 + e^{\mu})^2} \binom{N}{2} \sim N \tag{17} $$

and $\Delta q = \sqrt{\langle L^2 \rangle - \langle L \rangle^2} / \langle L \rangle \sim 1/\sqrt{N} \to 0$, so effectively the system selects graphs with $\bar{q} = \alpha$.

Sometimes one also considers an ensemble of graphs with a predefined node degree sequence $\{q_1, q_2, \dots, q_N\}$. We shall call it micro-canonical. Again, in the simplest case one assumes that all labeled graphs are equiprobable in this ensemble. Properties of random graphs in such an ensemble strongly depend on the degree sequence.



IV. WEIGHTED HOMOGENEOUS GRAPHS 



In the previous section we described ensembles for which all labeled graphs had the same statistical weight. Random 
graphs in such ensembles have well known properties. It turns out, however, that most of these properties do not 
correspond to those observed for real world networks. One needs a more general set-up to define an ensemble of 
complex random networks. Such a set-up can be introduced as follows. One considers the same set of graphs as 
in Erdos-Renyi model but one ascribes to each graph a different statistical weight. In other words, one chooses a 
probability measure on the set of graphs which differs from the uniform measure. In the generalized ensemble, each 
graph in addition to the configuration space weight 1/N! has a functional weight $W(\alpha)$. For homogeneous random graphs this weight is assumed to depend only on graph topology. This means that the weight does not depend on nodes' labels: if $\alpha'_1$ and $\alpha'_2$ are labeled graphs of $\alpha$ then $W(\alpha'_1) = W(\alpha'_2) = W(\alpha)$. The partition function for a weighted canonical ensemble reads

$$ Z(N,L) = \sum_{\alpha' \in lg(N,L)} \frac{1}{N!}\, W(\alpha') = \sum_{\alpha \in g(N,L)} w(\alpha)\, W(\alpha), \tag{18} $$

where $w(\alpha) = n(\alpha)/N!$ is the same factor as before (7), being just the ratio of the number of labeled graphs $\alpha'$ of $\alpha$ (obtained by permutations of nodes' labels giving distinct adjacency matrices) and the number N! of all permutations of nodes' labels. For Erdos-Renyi graphs the functional weight is $W(\alpha) = 1$.
The simplest non-trivial example is a family of product weights W:

$$ W(\alpha) = \prod_{i=1}^{N} p(q_i), \tag{19} $$

where p(q) is a positive function depending on the degree q of a single node. This functional weight does not introduce correlations between node degrees. We shall refer to random graphs generated by this partition function as uncorrelated networks. One should however remember that the total weight does not entirely factorize because the configuration space weight $w(\alpha) = n(\alpha)/N!$ written as a function of node degrees $w(q_1, q_2, \dots, q_N)$ does not factorize. There is also another factor which prevents the model from full factorization and independence of node degrees, namely the constraint on the total number of links, $2L = q_1 + q_2 + \dots + q_N$, which for given L and N introduces correlations between the $q_i$'s. For example, if one of the $q_i$'s is large, say of order 2L, then the remaining ones have to be small in order not to violate the constraint on the sum. For a wide class of weights p(q) one can however show that in the large N limit the probability that a randomly chosen graph has degrees $q_1, \dots, q_N$ approximately factorizes:

$$ \pi(q_1, \dots, q_N) \approx \prod_{i=1}^{N} \pi(q_i). \tag{20} $$

For large N, the node degree distribution $\pi(q)$ (11), that is the probability that a random node of a random graph has degree q, can be approximated by

$$ \pi(q) = \frac{p(q)}{q!} \exp(-Aq - B), \tag{21} $$

where the parameters A, B are determined from the normalization condition $\sum_q \pi(q) = 1$ and from the condition for the average $\sum_q q\, \pi(q) = \bar{q} = 2L/N$. For example, for p(q) = 1, which corresponds to Erdos-Renyi graphs, one finds $A = -\ln \bar{q} = -\ln(2L/N)$ and $B = \bar{q} = 2L/N$, therefore $\pi(q)$ is given by the Poissonian from Eq. (12).

Since the node degree distribution $\pi(q)$ for weighted graphs (19) depends on p(q), one can choose the latter to obtain a desired form of the node degree distribution $\pi(q)$. Let $\pi(q)$ be a desired node degree distribution such that

$$ \sum_q \pi(q) = 1, \qquad \bar{q} = \sum_q q\, \pi(q). \tag{22} $$

If we choose the weight (19) with

$$ p(q) = q!\, \pi(q) \tag{23} $$

in the canonical ensemble with N nodes and L links, in the limit $N \to \infty$ and $2L/N = \bar{q}$ we obtain homogeneous random graphs with this node degree distribution. In this case the constants A, B from Eq. (21) vanish automatically: A = B = 0. In particular, by an appropriate choice of p(q) we can generate scale-free graphs with the Barabasi-Albert node degree distribution, $\pi(q) = \frac{4}{q(q+1)(q+2)}$ for q = 1, 2, ... and $\pi(0) = 0$, as an ensemble of graphs with L = N, $\bar{q} = 2$, and $p(q) = q!\, \frac{4}{q(q+1)(q+2)}$ for q = 1, 2, ... and p(0) = 0. However, for finite N the node degree distribution $\pi(q)$ deviates from the limiting shape due to finite size corrections, which are particularly strong for fat-tailed distributions $\pi(q) \sim q^{-\gamma}$. The maximal node degree scales as $q_{max} \sim N^{1/(\gamma - 1)}$ for $\gamma > 3$ and as $q_{max} \sim N^{1/2}$ for very fat tails, $2 < \gamma < 3$, as a result of structural constraints which also lead to the occurrence of correlations between node degrees.
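As a small consistency check of the scale-free example, the sketch below verifies with exact rational arithmetic that the Barabasi-Albert distribution is normalized and has mean degree two, and evaluates the product weight of Eq. (19) for a degree sequence. The function names are hypothetical, and the degree sequence in the last line is an arbitrary illustration.

```python
from fractions import Fraction
from math import factorial, log


def ba_degree_distribution(q):
    """pi(q) = 4 / (q (q+1) (q+2)) for q >= 1 and pi(0) = 0."""
    if q < 1:
        return Fraction(0)
    return Fraction(4, q * (q + 1) * (q + 2))


def node_weight(q):
    """One-node weight p(q) = q! * pi(q) from Eq. (23)."""
    return factorial(q) * ba_degree_distribution(q)


def log_functional_weight(degree_sequence):
    """ln W = sum_i ln p(q_i); returns -inf if some node has p(q_i) = 0."""
    total = 0.0
    for q in degree_sequence:
        weight = node_weight(q)
        if weight == 0:
            return float("-inf")
        total += log(weight.numerator) - log(weight.denominator)
    return total


q_max = 1000
norm = sum(ba_degree_distribution(q) for q in range(1, q_max + 1))
mean = sum(q * ba_degree_distribution(q) for q in range(1, q_max + 1))
print(float(norm), float(mean))
print(log_functional_weight([1, 2, 1]))
```

The partial sums telescope, so the truncated normalization and mean differ from 1 and 2 only by terms that vanish as the cutoff grows; graphs with an empty node get zero weight because p(0) = 0.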

One can define more complicated weights than those given by Eq. (19). A natural candidate for networks with degree-degree correlations is the following weight:

$$ W(\alpha) = \prod_{l=1}^{L} p(q_{a_l}, q_{b_l}), \tag{24} $$

where the product runs over all links of the graph, and the weight $p(q_a, q_b)$ is a symmetric function of the degrees of the nodes at the end points of the link. One can choose this function to favor assortative or disassortative behavior. In a similar way one can introduce probability measures on the set of graphs which mimic some other functional properties of real networks, like for example higher clustering [23, 24, 25]. One can do this in the micro-canonical, canonical, grand-canonical or any other ensemble. This is just the most general set-up to handle homogeneous networks.
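A weight of the form (24) is straightforward to evaluate for a given graph. In the sketch below the link function `similar_degree_weight` is a purely hypothetical choice of a symmetric $p(q_a, q_b)$ that favors links between nodes of similar degree; it is not taken from the text and serves only to show the mechanics.

```python
def link_product_weight(n_nodes, links, link_weight):
    """Functional weight W = product over links of p(q_a, q_b), as in Eq. (24)."""
    degree = [0] * n_nodes
    for a, b in links:
        degree[a] += 1
        degree[b] += 1
    weight = 1.0
    for a, b in links:
        weight *= link_weight(degree[a], degree[b])
    return weight


def similar_degree_weight(q_a, q_b):
    """Hypothetical symmetric link weight favoring equal degrees at both ends."""
    return 1.0 / (1.0 + abs(q_a - q_b))


path = [(0, 1), (1, 2)]              # degrees 1, 2, 1
triangle = [(0, 1), (1, 2), (0, 2)]  # all degrees equal to 2
print(link_product_weight(3, path, similar_degree_weight))
print(link_product_weight(3, triangle, similar_degree_weight))
```

Because the weight depends only on the degrees at the ends of each link, it is unchanged by any permutation of nodes' labels, as required for homogeneous graphs.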



V. MONTE-CARLO GENERATOR OF HOMOGENEOUS NETWORKS 



Erdos-Renyi graphs are exceptional in the sense that one can calculate almost all quantities of interest for them analytically. This is not the case for weighted networks. Various methods have been proposed for generating random graphs [26]. In this section we will describe a Monte-Carlo method which allows one to study a wide class of random weighted graphs by means of numerical experiments. The basic idea behind this type of experiment is to sample the configuration space of graphs with probability proportional to the statistical weight or, in other words, to generate graphs with a desired probability. Again, the Erdos-Renyi graphs are exceptional because one can generate them one by one independently of each other. This is just because they are equiprobable. For weighted graphs the situation is not that easy since there are no efficient algorithms to pick an element from a large set with a given probability. The naive algorithm, which relies on picking an element uniformly and then accepting it with the given probability, has a very low acceptance rate. Therefore one has to use another idea. We will describe below how to generate graphs using a dynamical Monte-Carlo technique.

The idea is to run a random walk process in the set of graphs which visits configurations with a frequency proportional to their statistical weight. Mathematically, this means that one has to invent a stationary Markov chain (process) for which the stationary distribution is proportional to the statistical weights of graphs: $p_\alpha \sim W(\alpha)/Z$.

The Markov chain is defined by the transition probabilities $P(\alpha \to \beta)$ that the random walker will go in one step from a configuration (graph) $\alpha$ to $\beta$. The probabilities are stored in a transition matrix $\mathbf{P}$, $P_{\alpha\beta} = P(\alpha \to \beta)$, which is also called the Markov matrix. For a stationary process, the transition matrix $\mathbf{P}$ is constant during the random walk. The random walk is initiated from a certain graph $\alpha_0$ and then elementary steps are repeated, producing a sequence (chain) of graphs $\alpha_0 \to \alpha_1 \to \alpha_2 \to \dots$. The probability $p_\beta(t+1)$ that a graph $\beta$ will be generated in the (t+1)-th step of the Markov process can be calculated as

$$ p_\beta(t+1) = \sum_\alpha p_\alpha(t)\, P_{\alpha\beta}. \tag{25} $$

The last equation can be written as

$$ \mathbf{p}(t+1) = \mathbf{P}^{\tau} \mathbf{p}(t), \tag{26} $$

where $\tau$ denotes transposition, and $\mathbf{p}$ is a vector of elements $p_\alpha$. One should note that the stationary state $\mathbf{p}(t+1) = \mathbf{p}(t)$ corresponds to a left eigenvector of $\mathbf{P}$ with the eigenvalue [35] $\lambda = 1$. If the process is ergodic, which means that any configuration can be reached by a sequence of transitions starting from any initial configuration, and if the transition matrix fulfills the detailed balance condition

$$ W_\alpha P_{\alpha\beta} = W_\beta P_{\beta\alpha} \quad \forall \alpha, \beta, \tag{27} $$

then the distribution can be shown to approach the desired stationary state: $p_\alpha(t) \to W_\alpha/Z$ for $t \to \infty$. We used the short-hand notation $W_\alpha$ for $W(\alpha)$. In other words, when the length of the Markov chain becomes infinite the probability of occurrence of graphs in the Markov chain becomes proportional to their statistical weights and becomes independent of the initial configuration. Therefore the average over graphs generated in this Markovian random walk is a good estimator of the average over the weighted ensemble. The price to pay for generating graphs in this way is that the consecutive graphs in the Markov chain may be correlated with each other. Therefore one has to find a minimal number of steps for which one can treat measurements on such graphs as independent.
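The convergence of the chain to the stationary state can be seen already for a two-state toy example. The weights and the transition matrix below are hypothetical numbers chosen so that the detailed balance condition holds; the two states are a stand-in for graphs, and the function names are illustrative only.

```python
def evolve(distribution, transition):
    """One step of the chain: p_beta(t+1) = sum_alpha p_alpha(t) * P[alpha][beta]."""
    n = len(distribution)
    return [sum(distribution[a] * transition[a][b] for a in range(n))
            for b in range(n)]


def satisfies_detailed_balance(weights, transition, tol=1e-12):
    """Check W_alpha * P[alpha][beta] == W_beta * P[beta][alpha] for all pairs."""
    n = len(weights)
    return all(abs(weights[a] * transition[a][b] - weights[b] * transition[b][a]) < tol
               for a in range(n) for b in range(n))


weights = [1.0, 3.0]             # hypothetical statistical weights
transition = [[0.7, 0.3],
              [0.1, 0.9]]        # rows sum to one

distribution = [1.0, 0.0]        # start with certainty in state 0
for _ in range(200):
    distribution = evolve(distribution, transition)

z = sum(weights)
print(distribution, [w / z for w in weights])
```

After a sufficient number of steps the distribution no longer remembers the initial state and agrees with the normalized weights.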

One should note that the only characteristic of the Markov process which has a physical meaning from the point 
of view of the simulated ensemble is the stationary distribution. All other dynamical properties of the random walk 
which are encoded in the form of the transition matrix $P(\alpha \to \beta)$ are irrelevant. Many different transition matrices $P$ may 
have the same stationary distribution. Indeed, many of them fulfill the detailed balance condition (27) for given weights 
$W_\alpha$. The best known choice of $P$ is 

$$P_{\alpha\beta} = \min\left(1, \frac{W_\beta}{W_\alpha}\right). \qquad (28)$$



This choice is quite general and can be used in many different contexts. It is called the Metropolis algorithm. For the 
current configuration $\alpha$ one proposes a change to a new configuration $\beta$ which differs slightly from $\alpha$ and then one 
accepts it with the Metropolis probability (28). One repeats this many times producing a chain of configurations. 
The proposed modifications should not be too large since then the acceptance rate would be small. Therefore the 
algorithm makes only small steps (moves) in the configuration space which form a sort of weighted random walk. 
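
The Metropolis rule (28) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `metropolis_accept` and `metropolis_chain` are hypothetical, the proposal function is assumed to be symmetric (a move and its reverse are proposed with equal probability), and the current weight is assumed to be strictly positive.

```python
import random


def metropolis_accept(w_current, w_proposed, rng=random):
    """Accept with probability min(1, W_beta / W_alpha), cf. Eq. (28).

    Assumes w_current > 0, i.e. the chain never sits on a zero-weight state.
    """
    if w_proposed >= w_current:
        return True
    return rng.random() < w_proposed / w_current


def metropolis_chain(state, propose, weight, n_steps, rng=random):
    """Run n_steps updates; propose(state, rng) must be a symmetric proposal."""
    for _ in range(n_steps):
        candidate = propose(state, rng)
        if metropolis_accept(weight(state), weight(candidate), rng):
            state = candidate
    return state
```

A rejected proposal leaves the chain on the current configuration, which then counts once more in any average taken over the chain.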



Now, we want to apply this method to generate Erdos-Renyi graphs. Let us begin with the canonical ensemble 
with $N$, $L$ fixed. A good candidate for an elementary transformation of a graph is the rewiring of a link as shown in Fig. 4 
because it does not change $N$ and $L$. As mentioned before it is convenient to introduce a representation in which each 
undirected link is represented by two directed links. The rewiring is done in two steps. First we choose a directed 
link $ij$ and a vertex $k$ at random. Then we rewire the link $ij$ to $ik$. If there is already a link between $i$ and $k$ or if the 
vertex $k$ coincides with $i$, we reject the rewiring since it would otherwise lead to a multiple- or self-connection. One 
should note that the result of rewiring $ij$ is not the same as that of rewiring $ji$. The move is accepted with the Metropolis 
probability. For the canonical ensemble of Erdos-Renyi graphs this probability is equal to one since the functional weights 
are $W_\alpha = W_\beta = 1$ in Eq. (28). 

FIG. 4: The idea of rewiring: a random link (dotted line) is rewired from vertex j to a random vertex k (left hand side). 
Alternatively (right hand side) a random, oriented link (dotted line) is rewired from the vertex at its end j to a random vertex k. 
The opposite link j → i is simultaneously rewired. 

Let us see how rewiring transformations work in practice. Consider as an example the set of graphs shown in Fig. 2. 
If we pick up the link 3-2 in the graph A and rewire it to 3-1, we will obtain the graph B. If we rewire the 
link 2-3 to 2-4, we will get the graph C. So using the procedure of rewiring shown in Fig. 4 we can obtain every 
graph in the ensemble. The rewiring transformation is ergodic in this set of graphs. 

To summarize, our procedure of generating graphs in this training ensemble looks as follows. We construct an 
arbitrary graph having $N = 4$ nodes and $L = 3$ links to initiate the procedure and then we iteratively repeat rewirings 
for randomly chosen links and vertices. The only restriction is that the rewirings do not produce self- or multiple 
connections. We keep on repeating until we obtain "thermalized graphs". Then we can begin measuring quantities 
on the generated sequence of random graphs. 
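
The procedure summarized above can be sketched as follows. The data layout is an assumption made for illustration: a list `edges` of undirected links plus a list of neighbour sets `adj`. Picking a random undirected link and then a random orientation is equivalent to picking one of the $2L$ directed links. The function names are hypothetical.

```python
import random


def build_adjacency(n, edges):
    """Neighbour sets for a simple undirected graph with n vertices."""
    adj = [set() for _ in range(n)]
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    return adj


def rewire_attempt(edges, adj, rng):
    """Try to rewire a random directed link ij to ik; return True on success."""
    e = rng.randrange(len(edges))
    i, j = edges[e]
    if rng.random() < 0.5:          # choose the orientation ij or ji
        i, j = j, i
    k = rng.randrange(len(adj))     # random target vertex
    if k == i or k in adj[i]:       # would create a self- or multiple connection
        return False
    adj[i].remove(j)
    adj[j].remove(i)
    adj[i].add(k)
    adj[k].add(i)
    edges[e] = (i, k)
    return True


def thermalize(n, edges, attempts, seed=0):
    """Apply a number of rewiring attempts to a copy of the initial graph."""
    rng = random.Random(seed)
    edges = list(edges)
    adj = build_adjacency(n, edges)
    for _ in range(attempts):
        rewire_attempt(edges, adj, rng)
    return edges
```

For the training ensemble one could call `thermalize(4, [(0, 1), (1, 2), (2, 3)], 1000)`; the number of attempts needed for thermalization is not fixed by the text and is chosen here arbitrarily.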

Let us check that the described Monte-Carlo procedure indeed generates graphs with the expected probabilities. 
Let us calculate the Markovian matrix $P$ for the rewiring procedure in this ensemble. First we calculate the 
transition probability from A to B. The graph A can be converted into B in one step if we rewire the link "b" in Fig. 5 
to the vertex 2, or alternatively the link "e" to the vertex 1. We see that for this change we can choose two out 
of six links and one of four vertices to obtain the desired change. Thus the probability of choosing links is 2/6 and 
of choosing the correct vertex is 1/4, so the total probability is $P(A \to B) = 2/6 \cdot 1/4 = 2/24$. Let us now calculate 
$P(A \to C)$. To obtain C from A we have to rewire "a" to 3 or "f" to 4. Thus $P(A \to C) = 2/6 \cdot 1/4 = 2/24$. We 
can find $P(A \to A)$ from the condition: $P(A \to A) + P(A \to B) + P(A \to C) = 1$. This yields $P(A \to A) = 20/24$. 
Repeating the calculations for the remaining cases we find: $P(B \to A) = 6/24$, $P(B \to B) = 18/24$, $P(B \to C) = 0$ 
and $P(C \to A) = 6/24$, $P(C \to B) = 0$, $P(C \to C) = 18/24$. The results can be collected in a transition matrix: 



$$P = \frac{1}{24} \begin{pmatrix} 20 & 2 & 2 \\ 6 & 18 & 0 \\ 6 & 0 & 18 \end{pmatrix}. \qquad (29)$$



VI. MONTE-CARLO GENERATOR OF CANONICAL ENSEMBLE 



We can now determine the stationary probability distribution of the Markov process as a left eigenvector of the 
transition matrix $P$ with the eigenvalue one. We obtain $p_A : p_B : p_C = 3 : 1 : 1$, in agreement with the weights expected 
for this ensemble. This is not surprising since $P$ satisfies the detailed balance condition (27) and the corresponding changes are ergodic. 
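
The statement about the stationary distribution can be checked directly with exact rational arithmetic. The sketch below takes the transition probabilities quoted in the text for the graphs A, B, C; the helper names `step` and `satisfies_detailed_balance` are hypothetical.

```python
from fractions import Fraction as F

# Transition matrix of Eq. (29); rows and columns are ordered A, B, C.
P = [[F(20, 24), F(2, 24), F(2, 24)],
     [F(6, 24), F(18, 24), F(0)],
     [F(6, 24), F(0), F(18, 24)]]


def step(p, P):
    """One step of Eq. (25): p_beta(t+1) = sum_alpha p_alpha(t) P_alpha_beta."""
    n = len(P)
    return [sum(p[a] * P[a][b] for a in range(n)) for b in range(n)]


def satisfies_detailed_balance(p, P):
    """Check p_alpha P_alpha_beta == p_beta P_beta_alpha for all pairs."""
    n = len(P)
    return all(p[a] * P[a][b] == p[b] * P[b][a]
               for a in range(n) for b in range(n))


stationary = [F(3, 5), F(1, 5), F(1, 5)]   # the ratio 3 : 1 : 1, normalized

# Starting from graph A with certainty, iterate the chain in floating point.
P_float = [[float(x) for x in row] for row in P]
p = [1.0, 0.0, 0.0]
for _ in range(200):
    p = step(p, P_float)
```

After the loop `p` is numerically indistinguishable from `stationary`, which illustrates the independence from the initial configuration.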



FIG. 5: The representation of graph A in Fig. 2 as a directed graph. 



We have checked above by explicit calculation that the algorithm gives correct weights of Erdos-Renyi graphs 
for $N = 4$, $L = 3$. One can give a general argument that for any $N$, $L$ the algorithm generates labeled graphs 
which are equiprobable. Suppose that we have a certain labeled graph $\alpha'$ and want to get $\beta'$ by rewiring $ij$ to $ik$ 
(see Fig. 4). The total probability $P(\alpha' \to \beta')$ can be written as a product of two factors: the probability $P_c$ of 
choosing a particular candidate for a new configuration and the probability $P_a$ of accepting it. Because we choose 
a link $i \to j$ from $2L$ possible directed links and a vertex $k$ from $N$ vertices we have $P_c = 1/(2LN)$. Inserting 
$P(\alpha' \to \beta') = 1/(2LN) \, P_a(\alpha' \to \beta')$ in Eq. (27) and similarly for $\beta' \to \alpha'$ we get 



$$W_{\alpha'} P_a(\alpha' \to \beta') = W_{\beta'} P_a(\beta' \to \alpha'). \qquad (30)$$



But $W_{\alpha'} = 1/N!$ for all labeled graphs, thus $P_a(\alpha' \to \beta') = P_a(\beta' \to \alpha')$. This means that every move should be 
accepted unless it violates the multiple- or self-connection constraints. The rejection does not change the frequency 
of the occurrence of simple graphs but only restricts the space of sampled graphs to what we need. The weights of 
(unlabeled) graphs $\alpha$ are in this case $w(\alpha) = n(\alpha)/N!$, where $n(\alpha)$ is the number of distinct labeled graphs of $\alpha$. 
TABLE I: Theoretically calculated weights $w_\alpha$ of graphs in the canonical ensemble $N = 5$, $L = 4$ are normalized to ensure a 
probabilistic interpretation, $p_\alpha = w_\alpha / \sum_\alpha w_\alpha$, and compared with the experimental frequencies in the Markov chain using 
the rewiring algorithm. The results differ by the number $R$ of rewirings between consecutive measurements. 



Let us numerically test the algorithm. In Table I we compare the weights calculated analytically and computed 
from Monte-Carlo generated graphs for $N = 5$, $L = 4$. There are six different graphs in this ensemble. We generated 
10^ graphs. Each of them was obtained from the previous one by R rewirings, or more precisely by R attempts of 
rewiring. 

As we can see in Table I, the frequency of occurrence of each graph is in excellent agreement with the expected 
weights. The results do not depend on the separation R between the measurements. In the chain of 10^ graphs, 
each graph in this ensemble is produced many times. For larger ensembles the algorithm would not be able to visit 
all graphs since the number of graphs is very large. In this case the algorithm would choose only those graphs 
which are most representative. To make sure that the algorithm has reached the stationary distribution one should 
start a couple of random walks from different corners of the configuration space and run the algorithm so long as the 
statistical properties of graphs generated in all the random walks become identical. 

Generalization of the algorithm to a weighted ensemble is straightforward. We insert the statistical weights $W_\alpha$ 
of this ensemble into the Metropolis formula (28). Consider in particular a product weight (19). We choose a link 
$ij$ and a vertex $k$ on the current configuration $\alpha$ at random and attempt to rewire the link to $ik$ to obtain a new 
configuration $\beta$. The change is accepted with the probability 

$$P_a(\alpha \to \beta) = \min\left\{1, \frac{p(q_j - 1)\, p(q_k + 1)}{p(q_j)\, p(q_k)}\right\}. \qquad (31)$$

The degrees $q_j$, $q_k$ are taken from $\alpha$. Clearly the rewiring changes the degrees $q_j \to q_j - 1$, $q_k \to q_k + 1$, and leaves 
the degrees of the remaining nodes intact. The ratio $W_\beta/W_\alpha$ can be calculated for any form of statistical weights, so the 
algorithm is general. 
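
For a product weight the ratio $W_\beta/W_\alpha$ only involves the two vertices whose degrees change. A minimal sketch, assuming the weight is $W = \prod_i p(q_i)$ and that `p` is a user-supplied function returning a positive number for every degree that can occur; the name `rewiring_acceptance` is hypothetical.

```python
from fractions import Fraction


def rewiring_acceptance(p, q_j, q_k):
    """Metropolis probability for rewiring ij -> ik with W = prod_i p(q_i).

    q_j and q_k are the degrees of j and k before the move; after it
    vertex j has degree q_j - 1 and vertex k has degree q_k + 1.
    """
    ratio = (p(q_j - 1) * p(q_k + 1)) / (p(q_j) * p(q_k))
    return min(1, ratio)


def example_weight(q):
    """Illustrative single-node weight p(q) = 1/(q + 1), chosen arbitrarily."""
    return Fraction(1, q + 1)
```

With `example_weight`, moving a link end from a vertex of degree 3 to a vertex of degree 1 is accepted with probability 8/9, while the reverse move is always accepted, so the two acceptance probabilities are in the ratio $W_\beta/W_\alpha$ as detailed balance requires.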



VII. MONTE-CARLO GENERATOR OF GRAND-CANONICAL ENSEMBLE 



The rewiring procedure described in the previous section does not change $N$ and $L$. If we want to simulate graphs 
from a grand-canonical ensemble for which $L$ is variable, we have to supplement the set of elementary transformations 
in the algorithm by transformations which change the number of links. We can introduce two mutually reciprocal 
transformations: adding and deleting a link. Both preserve the number of nodes $N$ but change the number of 
links: $L \to L \pm 1$. The two transformations must be carefully balanced. On a given graph $\alpha$ we have to choose one of 
them. Let the link addition be selected with the probability $p_+$ and the removal with $p_-$. Once the move is selected 
we have to choose a link-candidate for which the move is to be applied. It is convenient to split the total transition 
probability into three factors: 

$$P(\alpha \to \beta) = p_\pm P_c(\alpha \to \beta) P_a(\alpha \to \beta), \qquad (32)$$

where $p_\pm$ stands for one of $p_+$, $p_-$, $P_c(\alpha \to \beta)$ for the probability of choosing a candidate configuration for the 
change and $P_a(\alpha \to \beta)$ for the probability of accepting the move. Let $\alpha$ and $\beta$ be two graphs which differ by a link 
which is present on $\beta$ but absent on $\alpha$: $L(\alpha) = L(\beta) - 1$. The transition probability for adding a link to $\alpha$ has to be 
balanced with the probability of removing the link from $\beta$. In order to add a link to $\alpha$ we have to choose two vertices 
to which the addition of a link is attempted. The probability of choosing a given pair of vertices, if we choose two 
vertices independently, is $P_c(\alpha \to \beta) = 2/N^2$. Thus the total probability of this move is 

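$$P_{\alpha\beta} = P(\alpha \to \beta) = p_+ \frac{2}{N^2} P_a(\alpha \to \beta). \qquad (33)$$

In the reciprocal transformation we have to choose this link among all links. The probability of choosing one among 
$L_\beta$ links is $P_c(\beta \to \alpha) = 1/L_\beta = 1/(L_\alpha + 1)$. Thus the total probability of this move is 

$$P_{\beta\alpha} = P(\beta \to \alpha) = p_- \frac{1}{L_\alpha + 1} P_a(\beta \to \alpha). \qquad (34)$$

Now we have to insert the last two equations into the detailed balance condition which for the grand-canonical ensemble 
additionally includes the factor $e^{-\mu L}$: 

$$W_\alpha e^{-\mu L_\alpha} P_{\alpha\beta} = W_\beta e^{-\mu L_\beta} P_{\beta\alpha}. \qquad (35)$$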



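Using this we can calculate the ratio 

$$\frac{P_a(\alpha \to \beta)}{P_a(\beta \to \alpha)} = \frac{p_-}{p_+} \frac{N^2}{2(L_\alpha + 1)} e^{-\mu} \frac{W_\beta}{W_\alpha}. \qquad (36)$$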



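If one chooses the same number of attempts for adding and removing a link, $p_+ = p_-$, then the ratio $p_+/p_- = 1$ will 
disappear from the last equation and the acceptance probabilities for adding or removing a link in the Metropolis 
algorithm will read 

$$P_a(\alpha \to \beta) = \min\left\{1, \exp(-\mu) \frac{N^2}{2(L_\alpha + 1)} \frac{W_\beta}{W_\alpha}\right\}, \qquad (37)$$

and 

$$P_a(\beta \to \alpha) = \min\left\{1, \exp(+\mu) \frac{2 L_\beta}{N^2} \frac{W_\alpha}{W_\beta}\right\}, \qquad (38)$$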

respectively. As before if we want to produce only simple graphs we must have an additional condition which eliminates 
moves leading to self- or multiple connections. The algorithm is complete. One should note that there is no reason to 
do additional rewirings because a rewiring of a link $ij$ to a link $ik$ is equivalent to removing the link $ij$ and adding $ik$. 
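
The add/remove algorithm can be sketched for the unweighted case ($W_\alpha = 1$) with $p_+ = p_- = 1/2$, using the acceptance probabilities (37) and (38). The data layout (a list `edges` of undirected links and neighbour sets `adj`) and the function name are assumptions made for this illustration.

```python
import math
import random


def grand_canonical_attempt(edges, adj, mu, rng):
    """One attempt to add or remove a link on a simple graph; True if accepted."""
    n, L = len(adj), len(edges)
    if rng.random() < 0.5:
        # Addition: choose two vertices independently.
        i, j = rng.randrange(n), rng.randrange(n)
        if i == j or j in adj[i]:          # self- or multiple connection
            return False
        if rng.random() < min(1.0, math.exp(-mu) * n * n / (2 * (L + 1))):
            adj[i].add(j)
            adj[j].add(i)
            edges.append((i, j))
            return True
        return False
    # Removal: choose one of the L existing links.
    if L == 0:
        return False
    e = rng.randrange(L)
    if rng.random() < min(1.0, math.exp(mu) * 2 * L / (n * n)):
        i, j = edges[e]
        adj[i].remove(j)
        adj[j].remove(i)
        edges[e] = edges[-1]               # swap with the last entry, then drop it
        edges.pop()
        return True
    return False
```

A removal attempt on an empty graph is counted as a rejected move, so the graph without links can still appear repeatedly in the chain.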

In principle one could propose other algorithms. For example, one could consider a modified algorithm in which 
the move removing a link is done in a different way. Instead of picking up a link as a candidate, one could pick up 
two vertices at random, and then remove a link if there is any between them. The probability of choosing a pair 
of vertices would be $2/N^2$ and it would cancel with the identical factor for the probability of choosing candidates 
in the move adding a link. The fractions $N^2/2L$ and $2L/N^2$ would in this case disappear from equations (37) and 
(38). The two algorithms of course generate the same ensemble. However, the modified algorithm would have a much 
worse acceptance rate for sparse networks since the chance that there is a link between two randomly chosen vertices 
on a sparse graph is very small. Most of the chosen pairs of vertices are not connected by a link and therefore the 
algorithm will do nothing since there is no link to remove. 

This problem is absent for the algorithm which we described previously since in that case only existing links are 
chosen as candidates for removal. One can easily estimate that the probability of accepting a link removal (38) is 
not very small. Indeed, even for a sparse graph the factor $e^{\mu} 2L/N^2$ in Eq. (38) is of order unity. In this case both 
$\exp(\mu)$ and $L$ for large $N$ grow proportionally to $N$ and their product balances the factor $N^2$ in the denominator. 
The algorithm has a finite acceptance rate which does not vanish when the system size grows. 

As an exercise, let us consider an example of unweighted ($W_\alpha = 1$) graphs with $N = 3$. This ensemble consists of 
four graphs shown in Table II. Their statistical weights can be easily found to be $1/3!$, $3e^{-\mu}/3!$, $3e^{-2\mu}/3!$, $e^{-3\mu}/3!$, so 
we expect that the frequency of occurrence in random sampling should be $1 : 3e^{-\mu} : 3e^{-2\mu} : e^{-3\mu}$. As we see in Table II, 
the results of Monte-Carlo simulations are in perfect agreement with this expectation. 
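
The expected frequencies of this exercise follow from counting labeled graphs. The sketch below uses the fact that the number of labeled simple graphs with $L$ links on $n$ vertices is the binomial coefficient $\binom{n(n-1)/2}{L}$; the function name is hypothetical. For $N = 3$ each value of $L$ corresponds to a single graph, which is no longer true for larger $N$, where the function only gives the distribution of the number of links.

```python
from math import comb, exp


def link_number_distribution(n, mu):
    """Probability of each link number L for unweighted simple graphs.

    The common factor 1/N! cancels in the normalization.
    """
    max_links = n * (n - 1) // 2
    weights = [comb(max_links, L) * exp(-mu * L) for L in range(max_links + 1)]
    total = sum(weights)
    return [w / total for w in weights]
```

For $n = 3$ the returned list is proportional to $1 : 3e^{-\mu} : 3e^{-2\mu} : e^{-3\mu}$.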
TABLE II: Comparison of the probability distribution of graphs in a grand-canonical ensemble with $N = 3$ nodes: calculated 
theoretically and computed in a run of Monte-Carlo simulation in which 10® graphs were generated. 



One can easily apply this technique to any form of statistical weights. In particular we can consider the product 
weights (19). The probability of accepting a new configuration by adding or removing a link between $i$ and $j$ reads 

$$\min\left\{1, \frac{N^2}{2(L+1)} \exp(-\mu) \frac{p(q_i + 1)\, p(q_j + 1)}{p(q_i)\, p(q_j)}\right\} \quad \text{for adding a link},$$

$$\min\left\{1, \frac{2L}{N^2} \exp(+\mu) \frac{p(q_i - 1)\, p(q_j - 1)}{p(q_i)\, p(q_j)}\right\} \quad \text{for deleting a link},$$

where $L$ and $q_i$, $q_j$ refer to the current configuration. 

VIII. MONTE-CARLO GENERATOR OF MICRO-CANONICAL ENSEMBLE 

Another frequently encountered ensemble is an ensemble of graphs which have a given node degree sequence 
$\{q_1, q_2, \dots, q_N\}$. The partition function $Z$ has the form: 

$$Z(N, \{q_i\}) = \sum_{\alpha' \in lg(N,L)} \left( \prod_{i=1}^{N} \delta[q_i(\alpha') - q_i] \right) \frac{1}{N!} W(\alpha'), \qquad (39)$$

where the product of delta functions allows one to include only those graphs which have the prescribed degree sequence 
$q_i$. As before the factor $1/N!$ is fixed in this ensemble and could in principle be skipped. The canonical partition 
function $Z(N, L)$ is related to the micro-canonical ones: 

$$Z(N, L) = \sum_{q_1=0}^{\infty} \cdots \sum_{q_N=0}^{\infty} Z(N, \{q_i\})\, \delta[q_1 + q_2 + \dots + q_N - 2L]. \qquad (40)$$

To generate graphs from the micro-canonical ensemble one has to have a Markov process preserving node degrees. The 
main idea is to combine simultaneous rewirings as shown in Fig. 6. We shall call this combination "X-move". At 
each step one picks up two random links, $ij$ and $kl$, and rewires them to $il$ and $kj$. This procedure is ergodic, i.e. it 
explores the whole configuration space. Such a transformation was discussed in [27] where it was used to randomize 
graphs with a given node degree sequence. In that case the functional weight was $W_\alpha = 1$ and rewirings were done 
with probability equal to one. In the general case, if one considers non-trivial $W_\alpha$, one has to accept the change with the 
corresponding Metropolis probability (28). In this way one can for example generate graphs whose statistical weights 
depend on the number of triangles. In a sense one can perform a weighted randomization of networks with the given 
node degree sequence. 

FIG. 6: The idea of "X-move": two oriented links (dotted lines) ij and kl chosen in a random way are rewired, exchanging 
their endpoints. Then the opposite links (solid lines) are also rewired. 

Introducing a weight into randomization may be very important in the construction of scoring 
functions in problems of motif searching [28, 29, 30]. If one tries to determine relations between structural motifs and 
the functionality of a network, it is very important to properly construct a scoring function which may clearly account 
for the existence of a particular subgraph on a network and its function. Scoring functions are usually measured as 
a sort of distance between a network which displays a certain function and a random network which does not. An 
important problem in such studies is how to construct those networks which should serve as the background reference. 
The simplest idea is to use networks obtained by uniform randomization. This may however introduce some bias and 
may be misleading. Imagine for example that a motif which is responsible for a certain network function is built out 
of a couple of triangular loops and that triangular loops alone have no function. It is clear that one would like to 
control the abundance of triangular loops to distinguish between specific motifs and motifs which are more frequent 
by pure chance just because of a higher abundance of triangles. Therefore it may be important to control the number 
of triangles in the randomized reference networks used in the scoring function. This was just an example, but in the general 
case it might be useful to perform a weighted randomization taking into account some desired features of reference 
networks. 
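
The X-move for simple graphs can be sketched as follows for the unweighted case $W_\alpha = 1$, where every allowed move is accepted. The data layout (list of undirected links `edges`, neighbour sets `adj`) and the function name `x_move_attempt` are assumptions made for illustration; for a non-trivial weight one would add a Metropolis test before applying the change.

```python
import random


def x_move_attempt(edges, adj, rng):
    """Try to rewire links ij, kl to il, kj; all node degrees stay fixed."""
    a, b = rng.randrange(len(edges)), rng.randrange(len(edges))
    if a == b:
        return False
    i, j = edges[a]
    k, l = edges[b]
    if rng.random() < 0.5:        # random orientation of the first link
        i, j = j, i
    if rng.random() < 0.5:        # random orientation of the second link
        k, l = l, k
    # Reject moves that would create self- or multiple connections.
    if i == l or k == j or l in adj[i] or j in adj[k]:
        return False
    adj[i].remove(j)
    adj[j].remove(i)
    adj[k].remove(l)
    adj[l].remove(k)
    adj[i].add(l)
    adj[l].add(i)
    adj[k].add(j)
    adj[j].add(k)
    edges[a] = (i, l)
    edges[b] = (k, j)
    return True
```

Two links that share a vertex are rejected automatically by the checks above, because the exchanged links would coincide with existing ones.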



IX. GRAPH GENERATOR AND ADJACENCY MATRICES 



All the elementary transformations: rewiring, adding or removing a link, and the X-move have a simple representation 
in terms of adjacency matrices. Rewiring relies on picking up at random an element $A_{ij} = 1$ of the adjacency 
matrix and flipping it with an element $A_{ik} = 0$ so that after the move $A_{ij} = 0$ and $A_{ik} = 1$. For undirected links 
adjacency matrices are symmetric and therefore at the same time one has to flip $A_{ji} = 1$ and $A_{ki} = 0$. To add a 
link one chooses at random $A_{ij}$ and if $A_{ij} = 0$ and $i \neq j$, one changes it into $A_{ij} = 1$ (and for $A_{ji}$ correspondingly). 
To remove a link one picks up a non-vanishing element $A_{ij} = 1$ and substitutes it with $A_{ij} = 0$. To perform the X-move 
one picks up two non-vanishing elements of $A$ at random, say $A_{ij} = 1$ and $A_{kl} = 1$, and flips them simultaneously 
with $A_{il} = 0$ and $A_{kj} = 0$ to: $A_{ij} = 0$, $A_{kl} = 0$, $A_{il} = 1$ and $A_{kj} = 1$. Of course one also flips their four symmetric 
counterparts. In practice, when simulating sparse graphs one does not use the matrix representation since it would 
require $N^2$ storage capacity. For sparse matrices the number of non-vanishing matrix elements is proportional to $N$ 
and one can use a linear storage structure. It directly corresponds to the underlying graph structure. Using linear 
storage one can code graphs having of order 10^ nodes or even more on a PC. 



X. DEGENERATED GRAPHS (PSEUDOGRAPHS) 

In previous sections we described ensembles of simple graphs. Let us now discuss pseudographs, that is graphs which 
may have multiple- and self-connections. 

A degenerate undirected pseudograph can be represented by a symmetric adjacency matrix $A$ whose off-diagonal 
entries $A_{ij}$ count the number of links between vertices $i$ and $j$, and the diagonal ones $A_{ii}$ count twice the number 
of self-connecting links attached to vertex $i$. For example, the graph depicted in Fig. 7 has the following adjacency 
matrix: 



A = [6 × 6 symmetric matrix with nonzero entries 1, 1, 2, 1, 2, 1, 2; zero entries omitted] (41) 



In the representation where each undirected link is represented as two opposite directed links all matrix elements 
including the diagonal ones count the number of directed links emerging from the vertex. 

FIG. 7: Left hand side: an example of a pseudograph with $N = 6$, $L = 5$. Right hand side: its representation as a directed graph. 

As before let us first consider labeled pseudographs. However, in order to have a unique representation of a graph 
one has to label links as well. We did not have to do this for simple graphs since in that case each link was uniquely 
determined by its endpoints. This is no longer the case for degenerate graphs since there may be more than one link 
between two nodes. A pseudograph with $N$ nodes and $L$ links can be fully labeled by $N$ node labels and $2L$ labels of 
directed links. Each fully labeled graph has thus the configuration space weight equal to $1/(N!(2L)!)$. Let us work 
out the consequences of this choice. Denote by $\alpha$ a graph, by $\alpha'$ a labeled graph of $\alpha$ with labeled nodes only, and by $\alpha''$ a 
fully labeled graph of $\alpha$ with labeled nodes and labeled links. From here on, labeled graph means a graph which has 
labels only on nodes while fully labeled graph means a graph which additionally has labels on directed links. 

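The configuration space weight of $\alpha$ can be calculated as a sum over all fully labeled graphs $\alpha''$ as follows: 

$$w(\alpha) = \sum_{\alpha'' \in flg(\alpha)} \frac{1}{N!(2L)!} = \sum_{\alpha' \in lg(\alpha)} \frac{1}{N!} \prod_i \frac{1}{2^{A_{ii}/2} (A_{ii}/2)!} \prod_{i>j} \frac{1}{A_{ij}!}, \qquad (42)$$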



where $flg(\alpha)$ denotes the set of fully labeled graphs of graph $\alpha$, and $lg(\alpha)$ the set of labeled graphs (labeled nodes only) 
of graph $\alpha$. The expression $A_{ii}/2$ counts the number of self-connecting links attached to vertex $i$, and $A_{ij}$ is the 
multiplicity of links connecting $i$ and $j$. If there are no self-connections ($A_{ii} = 0$) and no multiple connections 
($A_{ij} \le 1$), the configuration space weight reproduces the weight of simple graphs. One can easily understand 
the appearance of the combinatorial factors in the general case. Suppose that we permute link labels of a fully labeled 
graph leaving node labels intact. Among all $(2L)!$ permutations not all are distinct. If we have $A_{ij}$ links between 
vertices $i$ and $j$ and we permute their labels, then all $A_{ij}!$ permutations will give the same labeled graph (if we 
simultaneously permute labels of the directed partners). Similarly if we have a vertex with a self-connection and we 
exchange labels of the two directed links emerging from this vertex, the fully labeled graph will not change. Thus for 
each self-connection two permutations lead to the same fully labeled graph. To summarize, the number of distinct 
permutations of link labels is reduced from $(2L)!$ by dividing out the factor 2 for each self-connection and $k!$ for each 
$k$-link multiple connection which just gives Eq. (42). It turns out that these weights are identical to the combinatorial 
factors of Feynman diagrams which appear in the perturbative series of a mini-field theory. One can thus interpret 
random pseudographs as Feynman diagrams and use perturbation theory to enumerate them. 

Let us consider as an example a canonical ensemble of pseudographs with $N = 3$ and $L = 3$. There are 14 graphs in 
this ensemble. They are shown in Fig. 8. In Table III we compare the theoretically calculated probability distribution 
of graphs: 

$$p_\alpha = \frac{w_\alpha}{\sum_\beta w_\beta}, \qquad (43)$$



using the weights calculated by the formula (42) with the probability distribution obtained experimentally from the 
frequency histogram of graphs produced by the Monte-Carlo generator. Now the generator works exactly as before 
except that it does not reject moves leading to self- or multiple connections. The results are in perfect accordance. 
FIG. 8: All pseudographs in the canonical ensemble with $N = 3$, $L = 3$. 

TABLE III: Comparison of theoretical and experimental (Monte-Carlo) computations of frequencies of graphs' occurrence in 
the ensemble with $N = 3$, $L = 3$. 

As an example let us calculate the weight of graph M in Fig. 8. The weight of each labeled graph of graph M, 
according to the formula (42), is equal to 

$$w_{M'} = \frac{1}{3!} \cdot \frac{1}{2^3} \cdot \frac{1}{2!} = \frac{1}{96}, \qquad (44)$$

where the first factor comes from $1/N!$, the second from the three self-connections, and the third from the fact that 
two of the self-connections are attached to the same vertex and thus can be permuted without changing the graph's connectivity. 
There are six distinct labeled graphs M' of graph M and thus 

$$w_M = \sum_{M'} \frac{1}{96} = \frac{6}{96} = \frac{1}{16}. \qquad (45)$$

One should note that the number of distinct labeled graphs varies from graph to graph. For example, for graph L 
there is only one labeled graph. In this case $w_{L'} = 1/3! \cdot 1/2^3 = 1/48$ and $w_L = 1/48$. The calculation can be easily 
repeated for each graph in Fig. 8 yielding the weights $w_\alpha$ listed in Table III. 
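
The weights of this small ensemble can also be obtained by brute force. The sketch below enumerates all generalized adjacency matrices with $N = 3$, $L = 3$, evaluates the labeled weight of Eq. (42), and sums it over each isomorphism class. The function names are hypothetical and the approach is only practical for very small $N$ and $L$. Graph M is taken, as described in the text, to have two self-connections on one vertex and one on another; graph L has one self-connection on each vertex.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial


def labeled_weight(A):
    """Weight of a node-labeled pseudograph with adjacency matrix A, Eq. (42)."""
    n = len(A)
    w = Fraction(1, factorial(n))
    for i in range(n):
        s = A[i][i] // 2                  # number of self-connections at vertex i
        w /= 2 ** s * factorial(s)
        for j in range(i):
            w /= factorial(A[i][j])       # multiplicity of the link between i and j
    return w


def canonical(A):
    """Smallest relabeled copy of A, used as a key for the unlabeled graph."""
    n = len(A)
    return min(tuple(tuple(A[p[i]][p[j]] for j in range(n)) for i in range(n))
               for p in permutations(range(n)))


def ensemble_weights(n, L):
    """Configuration space weights w(alpha) of all pseudographs with n nodes, L links."""
    pairs = [(i, j) for i in range(n) for j in range(i + 1)]   # i == j are loops
    weights = {}
    for counts in product(range(L + 1), repeat=len(pairs)):
        if sum(counts) != L:
            continue
        A = [[0] * n for _ in range(n)]
        for (i, j), c in zip(pairs, counts):
            if i == j:
                A[i][i] = 2 * c           # a self-connection counts twice
            else:
                A[i][j] = A[j][i] = c
        key = canonical(A)
        weights[key] = weights.get(key, 0) + labeled_weight(A)
    return weights
```

Calling `ensemble_weights(3, 3)` returns one entry per pseudograph of the ensemble, with the values $1/16$ for graph M and $1/48$ for graph L.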

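As follows from Eq. (42), the partition function for the canonical ensemble of pseudographs can be written in three 
different ways: 

$$Z(N,L) = \sum_{\alpha'' \in flg(N,L)} \frac{1}{N!(2L)!} = \sum_{\alpha' \in lg(N,L)} \frac{1}{N!} \left( \prod_i \frac{1}{2^{A_{ii}/2} (A_{ii}/2)!} \right) \left( \prod_{i>j} \frac{1}{A_{ij}!} \right) = \sum_{\alpha \in g(N,L)} w(\alpha). \qquad (46)$$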

The first sum runs over fully labeled graphs and has the simplest form since all fully labeled graphs have the same 
weight. The weight of labeled graphs in the second sum varies. We note that labeled graphs are isomorphic with 
adjacency matrices, that is each labeled graph is uniquely represented by a generalized adjacency matrix like Eq. 
(41). The sum over $\alpha' \in lg(N,L)$ can thus be interpreted as a sum over all generalized adjacency $N \times N$ symmetric 
matrices $A$ such that $\sum_{ij} A_{ij} = 2L$. We see that, unlike for simple graphs, not all adjacency matrices have the same 
statistical weight since the weights depend on the number of self- and multiple connections. 

A weighted ensemble of pseudographs is constructed as before by introducing an additional functional weight $W(\alpha)$ 
under the sum defining the partition function (46): 

$$Z(N,L) = \sum_{\alpha'' \in flg(N,L)} \frac{1}{N!(2L)!} W(\alpha'') = \sum_{\alpha \in g(N,L)} w(\alpha) W(\alpha). \qquad (47)$$

As before the functional weight $W(\alpha'')$ does not depend on the graph's labeling but only on the graph's topology. In other 
words if $\alpha''_1$ and $\alpha''_2$ are two different fully labeled graphs of graph $\alpha$ then $W(\alpha''_1) = W(\alpha''_2) = W(\alpha)$. 

We can now consider various weights: for example a product weight as in Eq. (19) to mimic graphs with uncorrelated 
node degrees. But even in this case the total weight does not factorize since the configuration space weight $w(\alpha)$ 
written as a function of node degrees $w(q_1, q_2, \dots, q_N)$ does not factorize. Due to the absence of the structural 
constraints the approximation given by equations (20) and (21) has now much weaker finite size corrections. 

A grand-canonical ensemble for pseudographs with arbitrary product weights (19) has the following partition 
function: 

$$Z(N,\mu) = \sum_L \exp(-\mu L) \sum_{\alpha \in g(N,L)} w_\alpha \prod_{i=1}^{N} p(q_i(\alpha)). \qquad (48)$$

This means that all pseudographs with fixed node degrees $\{q_i\}$ have the same functional weight $\sim p(q_1) \cdots p(q_N)$, 
which seems to be similar to that generated by the Molloy-Reed construction of pseudographs. Let us comment on 
this. In the Molloy-Reed construction one generates a sequence of non-negative integers $\{q_1, q_2, \dots, q_N\}$, for example 
as independent identically distributed numbers with the distribution $p(q)$. One interprets the $q_i$'s as node degrees. The 
only requirement is that the sum $q_1 + q_2 + \dots + q_N = 2L$ is even. In the first step of the construction each integer $q_i$ 
is represented as a hub built out of a vertex and $q_i$ outgoing branches which can be viewed as directed links emerging 
from this vertex. In the second step the directed links are paired randomly in couples of links in opposite directions to 
form undirected links connecting vertices. This procedure generates the same subset of pseudographs as the partition 
function $Z(N,\mu)$ (48). However, the statistical weights are different. 
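
The second step of the construction, the random pairing of directed links, can be sketched as follows. The function name `molloy_reed_pairing` is hypothetical. Shuffling the list of link ends and pairing consecutive entries is used here as one simple way of drawing a uniformly random pairing.

```python
import random


def molloy_reed_pairing(degrees, rng):
    """Pair the outgoing branches of all hubs at random.

    Returns a list of undirected links (i, j); self- and multiple
    connections are allowed, so the result is in general a pseudograph.
    """
    if sum(degrees) % 2:
        raise ValueError("the sum of degrees must be even")
    # One entry per branch, labeled by the vertex it emerges from.
    stubs = [v for v, q in enumerate(degrees) for _ in range(q)]
    rng.shuffle(stubs)
    return [(stubs[2 * m], stubs[2 * m + 1]) for m in range(len(stubs) // 2)]
```

For the degree sequence `[2, 2, 2]` the function returns three links whose endpoints reproduce the prescribed degrees, with a self-connection contributing two to the degree of its vertex.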

To see this, let us consider a subset of Molloy-Reed graphs obtained for a given set $\{q_i\}$. There are $N$ labeled 
vertices and $2L = \sum_i q_i$ labeled directed links. All permutations of labels of links and nodes are equiprobable exactly 
as it was before for fully labeled pseudographs (42). If one calculates the corresponding symmetry factors for node-labeled 
graphs the same combinatorial factors arise as in Eq. (42): if one pairs two directed links a and b which belong to 
the same vertex one obtains a self-connecting link. The pair ab is identical to ba since both links begin and end 
at the same vertex. This reduces the number of distinct permutations by a factor of 2 as in Eq. (42). Similarly for $k$ 
pairs of directed links between two vertices one can exchange the order of pairing in $k!$ ways each time obtaining the 
same multiple connection, so the corresponding factor is $1/k!$, again as in Eq. (42). The conditional probability of 
choosing a particular graph $\alpha$ under the condition that in the first step of the construction the set $\{q_1, q_2, \dots, q_N\}$ 
has been selected, is 

$$w_{M-R}(\{q_i\}|\alpha) = \frac{w_\alpha}{\sum_{\beta \in g\{q_1,\dots,q_N\}} w_\beta}, \qquad (49)$$

where the sum is done over all (unlabeled) pseudographs $\beta$ from the micro-canonical set of fixed degrees $\{q_1, \dots, q_N\}$. 
The total probability is thus 

$$w_{M-R}(\alpha) = P(\{q_i\})\, w_{M-R}(\{q_i\}|\alpha), \qquad (50)$$

where $P(\{q_i\})$ is the probability that in the first step of the construction the set $\{q_1, \dots, q_N\}$ is selected. This 
probability is proportional to the product of $p(q_i)$'s multiplied by the number of permutations of $\{q_1, \dots, q_N\}$ which 
give the same set. We denote this number by $\mathrm{Perm}(q_1, \dots, q_N)$. The order of $q_i$'s does not matter since we consider 
unlabeled graphs. For example, the following permutations (sequences): $(q_1, q_2, q_3) = (3,3,2)$, $(3,2,3)$ and $(2,3,3)$ 
give the same set $\{3, 3, 2\}$, so in this case we have $\mathrm{Perm}(3, 3, 2) = 3$. In general, the number is given by 

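$$\mathrm{Perm}(q_1, \dots, q_N) = \frac{N!}{n_0!\, n_1! \cdots}, \qquad (51)$$

where $n_0, n_1, \dots$ are the degree multiplicities: $n_q = \sum_i \delta[q_i - q]$. Thus 

$$P(\{q_i\}) \propto \left( \prod_{i=1}^{N} p(q_i) \right) \mathrm{Perm}(\{q_i\}). \qquad (52)$$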
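
The counting factor of Eq. (51) is a multinomial coefficient and is easy to evaluate. A minimal sketch, with the hypothetical function name `perm`:

```python
from collections import Counter
from itertools import product
from math import factorial


def perm(degrees):
    """Number of distinct orderings of a degree sequence, Eq. (51)."""
    result = factorial(len(degrees))
    for multiplicity in Counter(degrees).values():
        result //= factorial(multiplicity)   # divide by n_q! for each degree value q
    return result


# All sequences of length three with q_i in {0, 1, 2} and an even sum.
even_sequences = [s for s in product(range(3), repeat=3) if sum(s) % 2 == 0]
```

Here `perm((3, 3, 2))` gives 3, as in the example of the text, and summing `perm` over the distinct sets contained in `even_sequences` recovers the total number of sequences.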
FIG. 9: A set of 10 pseudographs for $N = 3$ and $p(q) = 1/3$ for $q = 0, 1, 2$. Top: three hubs for $\{q_i\} = \{2, 2, 2\}$ are generated 
and then directed links are paired randomly giving three pseudographs $A_1$, $A_2$, $A_3$. Bottom: the rest of the pseudographs from this 
ensemble. 

TABLE IV: Weights calculated for the Molloy-Reed construction and for the corresponding grand-canonical ensemble with 
$N = 3$. For each sequence $q_1, q_2, q_3$ we give the combinatorial number $\mathrm{Perm}(q_1, q_2, q_3)$ of permutations leading to the same 
graph. Altogether, there are 14 different sequences of length three, with $q_i = 0$, 1 or 2, as can be seen in the third column of 
the table. 



Collecting all the factors together and normalizing to have the probabilistic interpretation we obtain the following 
expression for the total weight (probability) for Molloy-Reed pseudographs: 

$$w_{M-R}(\alpha) \propto \left( \prod_i p(q_i(\alpha)) \right) \mathrm{Perm}(q_1(\alpha), \dots, q_N(\alpha)) \frac{w_\alpha}{\sum_{\beta \in g\{q_1(\alpha),\dots,q_N(\alpha)\}} w_\beta}. \qquad (53)$$

The first factor comes from picking $N$ numbers $q_i$ at random, the second counts permutations and the third includes 
the weight generated by pairing directed links. As we see, despite many similarities the Molloy-Reed ensemble and 
the grand-canonical ensemble lead to different weights. As an example, in Fig. 9 we show an ensemble of 10 pseudographs with 
$N = 3$ generated by the Molloy-Reed algorithm for $p(q) = 1/3$ for $q = 0, 1, 2$ and zero elsewhere. We compare the statistical 
weights of the generated graphs with the corresponding ones in the grand-canonical ensemble. As we can see in Table IV 
the weights are different in the two ensembles. 



Summary 



We have discussed a statistical approach to homogeneous random graphs. This framework is a natural extension 
of the Erdos-Renyi theory to the case of weighted graphs: one considers the same set of graphs but with modified 
statistical weights. The statistical weights of homogeneous graphs depend only on the graphs' topology. In other words, if 
one assigns labels to the nodes of a graph, they have no physical meaning, much like the numbering of indistinguishable 
particles in quantum mechanics. One can permute them and the graph and its statistical weight will stay intact. The 
only information which matters is the number (entropy) of distinct permutations of node labels. All permutations 
of node labels are equivalent, unlike for growing networks where those permutations have to preserve the causal 
order corresponding to the order of the nodes' attachment to the graph. The statistical weight of a homogeneous graph 
is proportional to the number of all labeled graphs of this graph while that of a growing network is proportional to the number of 
causally labeled graphs. This leads to a difference between homogeneous and growing networks. For example, 
a typical homogeneous graph has a larger diameter than the corresponding growing network with the same node 
degree distribution [13]. Generally, geometrical properties of homogeneous graphs are different from those of growing 
networks for which correlations between the time of a node's attachment and its degree induce node-node correlations 
of a specific type [12]. Such correlations are absent for homogeneous graphs. 

Various functional properties of homogeneous networks can be modeled by an appropriate choice of functional 
weight. One can easily produce networks with assortative mixing, higher clustering or any desired property which 
can reflect any real-data observation. 

Homogeneous networks can be simulated numerically. We have also described a Monte-Carlo algorithm to generate 
canonical, grand-canonical and micro-canonical ensembles which performs a sort of weighted random walk (Markov 
chain) in the configuration space with a desired stationary distribution. We advocated the importance of the possibility 
of generating random networks with desired statistical properties for advanced motif searching [28, 29, 30]. 

Many real networks have resulted from hybrid processes of growth mixed with some thermalization. The framework 
discussed in this paper can flexibly interpolate between the two regimes. It allows one to directly investigate the 
relation between structural and functional properties of complex networks. 